Detection of Nonstationary Noise and Improved Voice Activity Detection in an Automotive Hands-free Environment
Abstract
Speech processing in the automotive environment is a challenging problem due to the presence of powerful and unpredictable nonstationary noise. This thesis addresses two detection problems involving both nonstationary noise signals and nonstationary desired signals. Two detectors are developed: one to detect passing vehicle noise in the presence of speech and one to detect speech in the presence of passing vehicle noise. The latter is then measured against a state-of-the-art voice activity detector used in telephony. The process of compiling a library of recordings in the automobile to facilitate this research is also detailed.
ACKNOWLEDGEMENTS
I owe a great deal of thanks to very many people for making my graduate work possible and enjoyable. I’d like to start by thanking my parents for their unwavering support and generous encouragement throughout my academic career and entire life. I’m certain I could not have gotten this far without it.
Next I’d like to thank Rick Brown for his commitment, encouragement, assistance and company over an intense three years of electrical and computer engineering. I owe two of those three years to him convincing me that I should go to graduate school and, despite a few rough spots, I’m very glad he did. I don’t think it’s possible for me to quantify everything I’ve learned from him over that time or my appreciation for it.
Bose Corporation deserves my thanks for their generosity in supporting me for my first year of graduate school. I’d specifically like to thank Jeff Faneuff for making my relationship with Bose possible. He and Vasu Iyengar also contributed much valuable advice and feedback on this work.
I owe Jeremy Slater and David Feinzeig thanks for their support, their company, and their assistance with this project. They’ve made many late nights in the lab rather enjoyable and were anything but hesitant to aid in the data collection effort.
Last, but certainly not least, I’d like to thank my committee, Brian King and John McNeill, for their time and their valuable feedback.
Similar resources
Improved Voice Activity Detection in the Presence of Passing Vehicle Noise
Voice activity detection (VAD) is an important enabling technology for a variety of speech-based applications including speech recognition, speech encoding, and hands-free telephony. The primary function of a voice activity detector is to provide an indication of speech presence in order to facilitate speech processing as well as possibly provide delimiters for the beginning and end of a speech...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Automotive 3-Microphone Noise Canceller in a Frequently Moving Noise Source Environment
A combined three-microphone voice activity detector (VAD) and noise-canceling system is studied to enhance speech recognition in an automobile environment. A previous experiment clearly shows the ability of the composite system to cancel a single noise source outside of a defined zone. This paper investigates the performance of the composite system when there are frequently moving noise sources...
Development and evaluation of hands-free spoken dialogue system for railway station guidance
In this paper, we describe the development and evaluation of a hands-free spoken dialogue system used for railway station guidance. In the railway station application, noise robustness is the most essential issue for the dialogue system. To address the problem, we introduce two key techniques in our proposed hands-free system: (a) blind spatial subtraction array (BSSA) as a preprocessi...
Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging
Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. In this paper, we present an improved minima controlled recursive averaging (IMCRA) approach, for noise estimation in adverse environments involving nonstationary noise, weak speech components, and low input signal-to-noise ratio (SNR). The noise estimate is obtained by averaging past spec...